Mohammed E. Safi; Eyad I. Abbas
Abstract
Speech recognition is widely used in robot control and automation. However, its use in robots is limited by its susceptibility to background noise. This paper proposes a speech recognition algorithm for controlling robots in noisy environments. The proposed algorithm is based on Power-Normalized Cepstral Coefficients (PNCC), a noise-resistant feature extraction technique, with a modified K-Nearest Neighbors (KNN) classifier that uses Dynamic Time Warping (DTW). The new KNN-DTW classifier integrates weighted KNN with DTW. The algorithm was selected through experiments comparing PNCC and Mel-frequency cepstral coefficient (MFCC) feature extraction with different classifiers: KNN-DTW, two types of KNN (weighted KNN and medium KNN), and two types of Support Vector Machine (SVM) (linear SVM and quadratic SVM). Accuracy was evaluated on the audio-visual data corpus UOTletters, which includes 30 speakers, 26 English letters, and 1560 utterances. The database is divided into 50% for training and 50% for testing. In a noise-free environment, the accuracy of the proposed algorithm reached 100%. Moreover, the proposed algorithm demonstrates greater noise immunity across all five noise levels, with an average accuracy advantage of 13.67% over the baseline algorithms.
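The idea of combining weighted KNN with DTW can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the classifier is modified, so the inverse-distance vote weighting, the Euclidean frame distance, and the function names `dtw_distance` and `weighted_knn_dtw` are all assumptions made for illustration. Each utterance is assumed to be a NumPy array of shape (frames, dims), such as a sequence of PNCC or MFCC feature vectors computed elsewhere.

```python
from collections import defaultdict

import numpy as np


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of shape (frames, dims)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # Pairwise Euclidean distance between every frame of a and every frame of b.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # cost[i, j] holds the best alignment cost of a[:i] against b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = local[i - 1, j - 1] + min(
                cost[i - 1, j],      # advance in a only
                cost[i, j - 1],      # advance in b only
                cost[i - 1, j - 1],  # advance in both
            )
    return cost[n, m]


def weighted_knn_dtw(query, train_seqs, train_labels, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k
    nearest training sequences under DTW (assumed weighting scheme)."""
    dists = np.array([dtw_distance(query, seq) for seq in train_seqs])
    nearest = np.argsort(dists)[:k]
    votes = defaultdict(float)
    for idx in nearest:
        # Closer neighbors contribute larger votes; eps avoids division by zero.
        votes[train_labels[idx]] += 1.0 / (dists[idx] + eps)
    return max(votes, key=votes.get)
```

Because DTW aligns sequences of different lengths, utterances spoken at different speeds can be compared directly without padding or truncation, and the distance weighting lets a very close match outvote several distant ones. The quadratic-time loop is adequate for short isolated-letter utterances but would need a windowed or vectorized variant for larger corpora.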